Vol. 59, No. 2


March 1962


Psychological Bulletin 


TECHNIQUES FOR THE STUDY OF LEARNING 
IN ANIMALS: 


ANALYSIS AND CLASSIFICATION¹


M. E. BITTERMAN 
Bryn Mawr College 


Although many different tech- 
niques for the study of learning in 
animals have been developed in the 
60 years or so since the problem of 
animal intelligence first was brought 
into the laboratory, their interrela- 
tions never have been carefully de- 
fined. Crude dichotomies have been 
proposed: “respondent conditioning”
versus “operant conditioning”
by Skinner (1935, 1937), and “classical
conditioning” versus “instrumental
conditioning” by Hilgard and Marquis
(1940). More recently a trichotomy
has been proposed, “classical conditioning”
versus “instrumental conditioning”
versus “selective learning” by Spence
(1956). But the diversity of method
is too great to be encompassed in 
any such one-way analysis. While 
certain differences among the tech- 
niques to be classified must be ignored 
if the number of categories is to be 
smaller than the number of tech- 
niques, the quest for parsimony seems 
to have been carried too far. 

Classification is not merely a mat- 
ter of taste. When one can find no 

¹ This paper grows out of a program of
research on the comparative psychology of 
learning supported by Contract Nonr 2829 
(01) with the Office of Naval Research and by 
Grant M-2857 from the United States Public 
Health Service. Its reproduction in whole or 
in part is permitted for any purpose of the 
United States Government. 


objective basis for evaluating the 
conviction that a given difference in 
technique should be stressed or that
another safely may be disregarded, it
is only because the proper experi- 
ments have not been performed. Con- 
sider, for example, the question of 
whether the difference between flex- 
ion conditioning with avoidable as 
compared with unavoidable shock 
should be reflected in a classification 
of techniques. The answer is “Yes”
for Hilgard and Marquis, who emphasize
the contingency of reinforcement
on response. They classify flexion
conditioning as “classical” when
shock is unavoidable and as “instrumental”
when shock is avoidable.
The answer is “No” for Spence,
who emphasizes the degree of control
afforded the experimenter over the 
appearance of the response to be 
learned. Ignoring the contingency of 
shock upon failure of response to the 
CS, Spence treats avoidance condi- 
tioning as a special case of classical 
conditioning in which the pattern of 
reinforcement gradually shifts from
consistent to intermittent. Such a 
disagreement surely need not remain 
long in the realm of opinion. One 
has only to compare the behavior of 
an animal trained with avoidable 
shock and that of a control animal 
trained with shock that is unavoidable
but simply withheld on whichever
trials the first animal avoids; if
response contingency is unimportant, 
the course of learning in the two ani- 
mals should be the same. 
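
The control procedure just described can be sketched in a few lines of Python. This is only an illustration of the design, not a procedure taken from any cited study: the function name and the example record are assumptions. The control animal's shock schedule is derived from the avoidance record of the first animal, so that both animals receive shock on exactly the same trials and differ only in whether shock is contingent on their own behavior.

```python
from typing import List


def yoked_shock_schedule(master_avoided: List[bool]) -> List[bool]:
    """Return, trial by trial, whether the control animal is shocked.

    The control animal receives unavoidable shock, except that shock is
    withheld on every trial on which the first (master) animal avoided.
    """
    return [not avoided for avoided in master_avoided]


# Hypothetical record for the first animal (True = it avoided on that trial).
master = [False, False, True, False, True, True]

# Shock actually received by each animal.
master_shocked = [not avoided for avoided in master]
control_shocked = yoked_shock_schedule(master)
```

With the two shock sequences equated in this way, any difference in the course of learning between the animals can be attributed to the response contingency alone.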

In general, a classification of tech- 
niques may be treated as the expres- 
sion of a set of hypotheses about the 
functional significance of differences 
in technique—a distinction between 
two techniques as an assertion that 
they yield results which differ in 
some fundamental respect, and a 
failure to distinguish between two 
techniques as an assertion that they 
may be used interchangeably in the 
analysis of learning. This is not to 
say that a classification may not be 
preferred on historical, or on peda- 
gogical, or even on esthetic grounds, 
but only that a functional interpreta- 
tion is available which provides a 
basis for empirical evaluation. Meth- 
odological and functional considera- 
tions have, in fact, been linked rather 
closely in the past. Methodological 
distinctions have been taken as 
points of departure for dual-process 
analyses of learning, while strivings 
for a unitary conception have been 
reflected in the blurring of methodo- 
logical distinctions. One may even 
point to experiments designed ex- 
plicitly to provide a functional com- 
parison of different methods (Youtz, 
1938a, 1938b, 1939), although the 
empirical study of methodological 
interrelations certainly has not been 
carried very far. 

Functional considerations play a 
central role in the classification to 
be offered here, which grows out of a 
program of comparative research 
(Bitterman, 1960). The first step in 
the program is to assess the phyletic 
generality of certain theoretically 
significant phenomena of learning 
which have been established in work 
with the rat (hitherto the principal 
subject of research on learning), and
to that end a variety of simple animals
must be studied under conditions
analogous to those which have
been used for the study of the rat;
but what are “analogous” conditions?
Clearly, the answer to this
question requires some hypotheses 
about the essential properties of the 
various techniques which have been 
used for the rat. As will later be 
indicated, the comparative enter- 
prise not only motivates further 
methodological analysis but consti- 
tutes a new source of data in terms of 
which the outcome may be evalu- 
ated. 


THORNDIKIAN SITUATIONS 


It seems reasonable to begin the 
analysis with a set of closely inter- 
related techniques which date back 
to the turn of the century and which 
have yielded most of the information
on which contemporary conceptions 
of animal learning are based. The 
adjective Thorndikian is appropriate 
both because of Thorndike’s pioneer- 
ing role in their development and 
because their operation is predicated 
on an empirical law of effect. Fa- 
miliar examples are the problem box 
and the maze. In each of these situa- 
tions, traditionally, the experimenter 
sets out to change behavior by manip- 
ulating its consequences, that is, by 
arranging a contingency between 
some motivationally significant state 
of affairs (“reinforcement”) and the
behavior in question. Thus, pulling 
a loop in a problem box or turning to 
the left in a T maze may be en- 
couraged with food or discouraged 
with shock. Indeed, the motivational 
significance of any event may be 
assessed in terms of its effect on the 
response which produces it in such a 
situation. An event that facilitates 
the occurrence of a response upon 
which it is contingent is called a 
reward; an event that has the opposite
effect is called a punishment;
while an event that produces no 
measurable change in behavior is 
motivationally insignificant or neu- 
tral. An aversive stimulus is one 
whose onset is punishing, and in 
what is called escape training the off- 
set of such a stimulus serves as a 
reward. 


Unitary and Choice Situations 


An important distinction between 
two main types of Thorndikian situ- 
ation may be illustrated by a com- 
parison of the problem box and the 
maze. In both these apparatuses, the 
animal is afforded numerous possi- 
bilities for action, one of which the 
experimenter chooses to reward. The 
main difference between them has to 
do with the treatment of irrelevant 
responses. In work with the prob- 
lem box, the experimenter may take 
some qualitative notice of the variety 
of fruitless activities which appear, 
but his interest is centered on the 
rewarded response and the readiness 
with which it comes to expression. 
The basic datum is time. In the 
maze, by contrast, the unrewarded 
behavior of the animal is structured 
more clearly; certain major alterna- 
tives to correct response are deline- 
ated, and the interest of the experi- 
menter is centered on their decline 
and disappearance. The basic datum 
is error. Time may be recorded, but 
it does not as clearly reflect progress 
in the choice among alternative 
courses of action, the aspect of selec- 
tive learning which the maze is so 
well suited to display. 

The designation unitary Thorn- 
dikian situation (or T-1 situation) 
will be used here for the problem 
box and for any other Thorndikian 
situation in which but a single course 
of action is defined and the readiness 
with which it comes to expression is 
measured. The designation Thorndikian
choice situation (or T-2 situation)
will be used for the maze and
for any other Thorndikian situation 
in which two or more incompatible 
courses of action are defined and 
choice among them is studied. The 
nature of the responses delineated 
and the general properties of the en- 
vironments in which they appear are 
ignored in this classification. Thus, 
a problem box which offers a choice 
of manipulanda is classed with the 
maze as a T-2 situation, while the 
runway is classed as a T-1 situation 
despite its structural resemblance to 
the maze. The runway may, of 
course, provide a measure of error, 
as in the early work of Hicks (1911),
who plotted the learning of a cul- 
less maze in terms of retracing, while 
the potentialities of the maze for 
the study of choice may be ignored, 
as in the early work of Thorndike 
(1898), who, measuring only time, 
used the maze as though it were 
just another problem box. In such 
cases, the classification is based on the 
use to which the apparatus actually is 
put in a given experiment. For the 
most part, however, contradictions 
between potentiality and use are rare. 
An investigator interested in choice 
among alternative courses of action 
is not likely to use a runway, nor, un- 
less he is interested specifically in 
choice among alternative courses of 
action, is he likely (today) to use a 
maze. 

Both T-1 and T-2 situations may 
be “chained.” The most common
example of a chained T-2 situation 
is the maze of many choice-points, 
once very much the mode, but rarely 
encountered today, perhaps because 
of the conviction, expressed by 
Lashley (1918), that the single-unit 
maze is quite as sensitive as the 
multiple-unit maze to the effects of 
significant variables and much less 
costly in time and effort. (The two
kinds of apparatus are not, of course,
fully equivalent; certain problems— 
such as that of correction versus non- 
correction, first studied by Lashley— 
arise only when the number of 
choice points is reduced to one, while 
other problems—such as that of serial 
order—disappear.) Chained T-1 
situations never have been widely 
used. An example may be found in a
string of problem boxes, each present- 
ing one manipulandum, with the first 
giving access to the second, the 
second to the third, and so on, until 
the reward finally is attained (Her- 
bert & Arnold, 1947). For certain 
purposes, conceivably, mixed chains 
(composed both of T-1 and of T-2 
units) might be used. 


Generalized and Discriminative 
Situations 


Each of the two types of Thorn- 
dikian situation already distinguished 
—unitary and choice—may occur in 
discriminative as well as in general- 
ized form. This new distinction, 
which is orthogonal to the first, will 
be conveyed by adding the letters g 
(for generalized) and d (for discrim- 
inative) to the symbols for unitary 
and choice: T-1g, T-1d, T-2g, T-2d. 
In a discriminative problem, the ex- 
perimental environment is varied 
systematically from trial to trial, and 
with it the consequences of response, 
the capacity of the animal to dis- 
criminate the change being inferred 
from a corresponding variation in be- 
havior. In a generalized problem, 
there may be some variation in the 
experimental environment from trial 
to trial (intentional or unintentional), 
and there may be some variation in 
the consequences of response (as in 
work on partial reinforcement), but 
there is (by definition) no correlation 
between the two kinds of change, and 
hence there is no objective basis for 
systematic variation in behavior. 


In the simplest T-1d case, a single 
defined response is rewarded under 
one set of conditions but not re- 
warded (or punished) under another 
set of conditions, and the readiness 
with which the response comes to ex- 
pression under the two conditions is 
compared. For example, response in 
a single-window jumping apparatus is 
rewarded when a white card is dis- 
played but punished when the card 
displayed is black (Solomon, 1943). 
Performance in a T-1d problem may 
be expressed in terms of “error,” but
a temporal criterion is implied. For
example, in an early experiment by
Thorndike (1898), cats were fed for
climbing to the top of their cage in
response to the words “I must feed
those cats,” but not for making the
same response to the words “Tomorrow
is Tuesday,” an error being recorded
whenever they climbed up
(promptly) to the second phrase or 
failed (in a reasonable period of 
time) to climb up in response to the 
first. Similarly, Grice (1949), work- 
ing with another T-1d situation, 
computed the median response time 
for a series of trials and counted as an 
error any response to the negative 
stimulus faster than the median or 
any response to the positive stimulus 
slower than the median. A clear dis- 
tinction should be made between 
error thus defined and erroneous 
choice in a T-2d situation. 
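
The median criterion attributed to Grice (1949) can be stated compactly in code. The following Python sketch assumes that the median is taken over all trials of the series, positive and negative pooled, and that a response time exactly equal to the median is not counted as an error; neither detail is specified in the description above, and the function name and data layout are illustrative only.

```python
from statistics import median
from typing import List, Tuple


def count_median_errors(trials: List[Tuple[str, float]]) -> int:
    """Count T-1d "errors" by a median response-time criterion.

    Each trial is (stimulus, response_time), with stimulus "+" for the
    positive and "-" for the negative stimulus.
    """
    # Assumption: one median for the whole series, both stimuli pooled.
    cutoff = median(rt for _, rt in trials)
    errors = 0
    for stimulus, rt in trials:
        if stimulus == "-" and rt < cutoff:
            errors += 1  # responded quickly to the negative stimulus
        elif stimulus == "+" and rt > cutoff:
            errors += 1  # responded slowly to the positive stimulus
    return errors


# Hypothetical response times in seconds.
mixed = [("+", 1.0), ("+", 5.0), ("-", 2.0), ("-", 6.0)]
perfect = [("+", 1.0), ("+", 2.0), ("-", 5.0), ("-", 6.0)]
```

Note that an error in this sense is a statement about relative speed of a single response, not about a choice between alternatives, which is the point of the distinction drawn above.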

In a T-2d problem, two or more 
alternative responses are defined— 
two in the simplest case. One of the 
responses is rewarded and the second 
unrewarded (or punished) under a 
given set of conditions, while the 
consequences of the two courses of 
action are reversed under another set 
of conditions, and erroneous choices 
are counted. For example, in a con- 
ventional jumping apparatus, a jump 
to the right window is rewarded 
when the card in the right window is
white and the card in the left window
is black, but a jump to the left win- 
dow is rewarded when the positions 
of the two cards are interchanged; or, 
in the same apparatus, response to 
the right window is rewarded when 
two white cards are displayed, but 
response to the left window is rewarded
when two black cards are displayed.
(Problems of the first kind
have been termed “simultaneous,”
while problems of the second kind
have been termed “successive,” an
adjective applied as well to T-1d
problems; considerable confusion has
resulted from the failure to distinguish
between T-1d and successive
T-2d problems.) The so-called
“higher order” discriminations—od- 
dity, matching-from-sample, and 
multiple choice—also may be clas- 
sified as T-2d problems, although 
they seem to make demands which 
go far beyond those of the simpler 
problems first exemplified. The T-1d 
and T-2d categories actually are 
rather coarse ones which themselves 
invite careful analysis and subdi- 
vision. 

Like T-2g situations, T-2d situations
may be “chained,” as when
an animal is required to make a series 
of choices based on brightness before 
the reward is attained (Stone, 1928). 
Meaningful T-1d chains also are pos- 
sible, although no instance of such a 
chain is to be found in the literature. 
For example, response to a manipu- 
landum in one unit gives immediate 
access to the next unit when the 
positive stimulus is present; when 
the negative stimulus is present, ac- 
cess to the next unit is given after a 
predetermined period of time whether 
or not the animal responds. 


Discrete and Continuous Situations 


There has been little clarity on the 
relation of Skinner’s technique to 
other techniques for the study of
learning in animals. It has been
asserted by Woodworth (1938), for 
example, that the Skinner box “bridges
the gap” between the problem
box and the classical conditioning 
situation, and a similar view is met 
again in Spence (1956), who places 
the Skinner box on a continuum at a 
point intermediate between the meth- 
ods of Thorndike and Pavlov; but 
the notion of continuity is difficult to 
justify. Skinner (1935, 1937) cer- 
tainly has succeeded very well in 
drawing a sharp line between his 
method and that of Pavlov on the 
basis of criteria which fail to dis- 
tinguish his method from that of 
Thorndike. 

Skinnerian situations are Thorn- 
dikian situations as the term is de- 
fined here. The original Skinner box 
differs from the older problem box 
only in that it delivers food to the 
response compartment (instead of 
admitting the animal to a separate 
feeding compartment) when the de- 
fined response is made, a feature 
which eliminates handling of the ani- 
mal between trials. Equipped with a 
retractable lever, which is introduced 
to begin each trial and withdrawn 
after response, the Skinner box may 
be used in exactly the same manner 
as the older problem box; in fact, a 
retractable manipulandum which de- 
livered food to the responding animal 
was developed for the monkey by 
Thorndike himself (1901). Skinner 
(1932), of course, has preferred to use 
his apparatus as a “repeating” problem
box (his own adjective), inverting
the traditional measure of performance,
and substituting for time
per response on discrete trials number 
of responses per unit time (rate of 
response) to a continuously available 
lever. Either way, a Skinner box 
containing one lever may be classified 
as a T-1 situation. A single response 
is delineated, its consequences are
manipulated, and the readiness with
which it comes to expression is 
measured. With two levers and the 
study of choice, the Skinner box be- 
comes a T-2 situation. 

It seems reasonable, nevertheless, 
to make a formal distinction between
Thorndikian situations in which la- 
tencies or choices are measured in 
discrete trials and their Skinnerian 
counterparts in which rates of re- 
sponse are measured under condi- 
tions of continuous opportunity to 
respond. Situations of the first kind 
will be designated as discrete, while 
those of the second kind will be desig- 
nated as continuous, and for symbolic 
purposes the subscripts d (for dis- 
crete) or c (for continuous) will be 
added to the T for Thorndikian, as, 
for example, in T_d-2g (discrete, choice,
generalized) or in T_c-1d (continuous,
unitary, discriminative). The dis- 
crete-continuous distinction reflects 
the hypothesis that rate of response, 
despite its close mathematical rela- 
tion to latency, has a functional sig- 
nificance which is to a certain extent 
unique, and some interesting evidence 
for this view comes from comparative 
studies of the effect of inconsistent 
reinforcement on resistance to extinc- 
tion. In the rat, discrete and con- 
tinuous techniques both give the so- 
called paradoxical effect (greater re- 
sistance to extinction after incon- 
sistent than after consistent rein- 
forcement). In the fish, initial re- 
sistance to extinction is greater after 
consistent reinforcement in the dis- 
crete case (Longo & Bitterman, 1960; 
Wodinsky & Bitterman, 1959, 1960); 
but some as yet unpublished data 
show greater resistance to extinction 
after inconsistent reinforcement in 
the continuous case. Whether the dif- 
ference in outcome may be traced to 
a difference in the functional proper- 
ties of the two techniques, or whether 
it is a product of certain parametric 
differences between the two sets of
experiments, remains to be determined.
The matter is introduced
here only to suggest the possibility 
that techniques which are function- 
ally equivalent for one species may 
not be so for others. In this connec- 
tion it is worth noting, perhaps, that 
the potentialities of the rate measure 
seem to be realized fully only when 
inconsistency of reinforcement is 
introduced. 
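
The symbols developed in this section amount to three independent two-valued attributes, which may be made explicit in a small data structure. The Python sketch below is offered only as an aid to reading the notation; the class and field names are assumptions, and the subscript to T is written after an underscore (T_d, T_c) because plain text has no subscript.

```python
from dataclasses import dataclass
from enum import Enum


class Timing(Enum):
    DISCRETE = "d"      # latencies or choices measured in discrete trials
    CONTINUOUS = "c"    # rate measured with continuous opportunity to respond


class Structure(Enum):
    UNITARY = "1"       # a single defined course of action
    CHOICE = "2"        # two or more incompatible courses of action


class Stimulus(Enum):
    GENERALIZED = "g"     # no correlation of environment with consequences
    DISCRIMINATIVE = "d"  # consequences vary with the environment


@dataclass(frozen=True)
class ThorndikianSituation:
    timing: Timing
    structure: Structure
    stimulus: Stimulus

    def symbol(self) -> str:
        """Return the symbol for this situation, e.g. 'T_d-2g'."""
        return f"T_{self.timing.value}-{self.structure.value}{self.stimulus.value}"


# A simple T maze run in discrete trials with no discriminative cue.
t_maze = ThorndikianSituation(Timing.DISCRETE, Structure.CHOICE, Stimulus.GENERALIZED)
# A one-lever Skinner box in which reinforcement depends on a signal.
skinner_box = ThorndikianSituation(Timing.CONTINUOUS, Structure.UNITARY, Stimulus.DISCRIMINATIVE)
```

Because the attributes are orthogonal, the scheme yields eight Thorndikian categories; the classification of a given experiment depends on the use to which the apparatus is put, not on the apparatus itself.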


A General Definition of Thorndikian 
Situations 


In each of the Thorndikian situa- 
tions considered thus far, a change in 
behavior is measured which springs 
from a contingency between some defined
response and some motivationally
significant state of affairs.
Experiments on latent learning sug- 
gest, however, that a Thorndikian 
situation may be characterized with- 
out reference either to the actual oc- 
currence of change in behavior or to 
the motivational significance of the 
consequences of response. In the T-1 
case, an investigator may set out de- 
liberately to minimize the motiva- 
tional significance of the consequences 
of response in an effort to minimize 
the extent of change in behavior. 
For example, a hungry rat is trained 
in a runway which leads to an empty 
end box or to one which contains only 
water. To arrange a set of end-box 
conditions which are entirely without 
motivational significance is not, of 
course, always very easy, but it can 
be done (Gonzalez & Diamond, 1960). 
In the T-2 case, the consequences of 
alternative responses, whether moti- 
vationally significant or not, may be 
balanced in an effort to forestall the 
development of a preference for one 
or the other response. For example,
a hungry rat is run in a simple T 
maze with both end boxes empty, or 
one empty and the other containing 
only water, or one containing food
and the other both food and water;
or a rat that is both hungry and thirsty
is run in a T maze with one end box 
containing food and the other con- 
taining water. Such situations are 
intended merely to provide occasions 
for learning whose effects are esti- 
mated in later tests. The tests always 
involve a change in the motivation- 
al significance of the consequences 
of response: for example, food is 
added to a previously empty end box; 
or the end box is associated with food 
in direct feedings; or the prevailing 
condition of deprivation is altered, 
and with it the relevance of previ- 
ously encountered incentives. Never- 
theless, despite the careful attention 
which must be paid to motivational 
significance in evaluating the out- 
come of exposure to a Thorndikian 
situation, the situation itself may be 
defined without reference to motiva- 
tional significance. What is essential 
only is a contingency of some specified
event or circumstance on some
measurable bit of behavior, a contingency
arranged by an investigator
who is interested in studying its
effects on the animal.²


² No treatment of Thorndikian techniques
would be complete without some mention of a
set of situations closely related to the problem
box (calling for string-pulling, rake-wielding,
box-stacking, and the like) which figured
prominently in the work of certain of Thorndike’s
critics, beginning with Hobhouse
(1901), who did not think that Thorndike’s
apparatus provided a representative picture
of animal intelligence. Designed to be fully
“surveyable” (to conceal nothing from the
animal) and, although simple in principle, to
render “chance” solutions unlikely, these
(Hobhousian) situations present Thorndikian
contingencies of a rather loose sort and may be
used, like Thorndike’s problem boxes, to
study the way in which the experience of such
contingencies affects subsequent behavior.
Their principal use, however, has been in inquiries
into the ability of animals to discover
appropriate modes of behavior in advance of
reinforcement, that is, in quests for evidence
of “productive” or “inferential” as contrasted
with “reproductive” or learned solutions.


PAVLOVIAN SITUATIONS 


Well before Pavlov’s experiments
on conditioning became widely
known, other investigators were led
quite independently, by an interest in
associative learning, to experiments 
of essentially the same kind. As far 
back as the turn of the century, a 
distinction was made between what 
was called “trial-and-error” or “selective
learning” (the modification of
behavior as a function of its consequences)
and what was called “association
of stimuli” or “substitution”
(the acquisition by one stimulus of
some of the behavioral properties of
a second stimulus as a function of the
pairing of the two stimuli). Primarily
concerned though he was with selec- 
tive learning, Thorndike (1898) him- 
self made use of paired stimulation; 
when a verbal statement such as “I
must feed those cats” was followed 
regularly by the presentation of food, 
he reported, the words alone would 
bring the animals to the feeding place. 
It seems fitting nonetheless—in view 
of the scope of Pavlov’s (1927) con- 
tribution—that the method should 
bear his name. 

In the traditional Pavlovian experi- 
ment, as in the traditional Thorn- 
dikian experiment, the behavior of 
the animal is altered by the introduc- 
tion of some motivationally signifi- 
cant stimulus such as food or shock 
(“reinforcement”), but there are important
differences. In a Thorndikian
experiment, reinforcement is con- 
tingent on response; doing one thing 
leads to food or to shock, doing 
another does not. In a Pavlovian 
experiment, reinforcement is sched- 
uled without regard to response; the 
experimenter does not set out to 
mold behavior in some predetermined
fashion, but only to study the
way in which the functional proper- 
ties of one stimulus are altered by 
virtue of its contiguity with another. 
Because their introduction is not
contingent on the animal’s behavior,
Pavlovian reinforcements cannot be 
treated as rewards or punishments in 
any meaningful manner, nor can re- 
wards and punishments be distin- 
guished in a Pavlovian experiment. 
Another difference between the two 
techniques is worth noting. In a 
Thorndikian experiment, the choice 
of the behavior which is to serve as 
the index of learning is independent 
of the choice of reinforcement; any 
of a large variety of responses which 
the animal is likely to make may be 
encouraged with food or discouraged 
with shock. In a Pavlovian experi- 
ment, the choice of reinforcement 
restricts the choice of a behavioral 
indicator; while the conditioned and 
unconditioned responses are not al- 
ways (as Pavlov thought) identical, 
the investigator must be guided in 
his search for evidence of learning 
by the functional properties of the 
reinforcing stimulus. Sharp as the 
distinction may be between the tradi- 
tional Thorndikian and Pavlovian 
procedures, it has been ignored very 
often by theorists preoccupied with 
the task of deriving all of the data 
of learning from the operation of 
a single process. Pavlov himself 
claimed, of course, that all instances 
of learning could be analyzed as in- 
stances of conditioning, although 
Thorndike, committed as he was to 
the generality of the law of effect, 
never was satisfied that Pavlov’s 
procedure could be cast in the same 
mold as his own. 

Coordinate with the unitary Thorn- 
dikian (or T-1) situation is the uni- 
tary Pavlovian (or P-1) situation, in 
which the tendency for a CS to pro- 
duce some defined effect is measured 
in terms of latency or magnitude. The 
defined effect may be a response which
is reflexly elicited by the US, as in the 
salivary conditioning experiment, or 
something quite different, as when
the rate of fixed-interval responding
in a Skinnerian situation is depressed 
by shock and by a stimulus paired 
with shock (Estes & Skinner, 1941). 
A P-1 situation may be generalized 
(P-1g) or discriminative (P-1d); in 
the discriminative case, the CS is 
varied systematically from trial to 
trial and with it the likelihood that 
the US will be presented (as, for 
example, when a bright light always 
is followed by food but a dim light 
never is). With two unconditioned 
stimuli, each eliciting a different re- 
sponse, it is possible to set up a P-2 
situation, the Pavlovian analogue of 
the Thorndikian choice situation. 
(A T-2 situation may be consti- 
tuted with but a single reinforcer, 
which is another interesting difference 
between Pavlovian and Thorndikian 
techniques.) The discriminative (P- 
2d) case is perhaps the easier to con- 
ceive than the generalized (P-2g). 
For example, one CS is paired with 
acid introduced into the mouth of a 
dog, while another CS is paired with 
meat-powder (Pavlov, 1927). The 
P-2g case must involve some in- 
consistency of reinforcement (which 
is, of course, not true of T-2g). For
example, a CS is paired with shock 
to the right forelimb on a random 
75% of trials and with shock to the 
left forelimb on the remaining 25% 
of trials. This is the Pavlovian ana- 
logue of a kind of T-2 situation in 
which there has been much interest 
of late. For example, a right turn at 
the choice point of a maze leads to 
food on a random 75% of trials while 
a left turn leads to food on the re- 
maining 25% of trials (Brunswik, 
1939).
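
The kind of random 75:25 schedule mentioned in these examples is easily generated. The Python sketch below is an illustration under stated assumptions rather than the procedure of any cited experiment: it fixes the proportions exactly within a block of trials and randomizes only their order, and the function name, the labels "right" and "left," and the seed are all hypothetical.

```python
import random
from typing import List


def partial_schedule(n_trials: int, p_right: float, seed: int = 0) -> List[str]:
    """Return a randomly ordered list naming the reinforced side on each trial.

    The proportion of "right" trials is fixed at p_right (rounded to a whole
    number of trials); only the order of trials is random.
    """
    n_right = round(n_trials * p_right)
    schedule = ["right"] * n_right + ["left"] * (n_trials - n_right)
    random.Random(seed).shuffle(schedule)  # seeded for reproducibility
    return schedule


# Twenty trials: 15 assigned to the right side, 5 to the left.
schedule = partial_schedule(20, 0.75)
```

In the P-2g case each entry names the limb to which shock is delivered after the CS; in the corresponding T-2g case it names the turn that leads to food on that trial. The schedule is the same, and only the dependence of the outcome on the animal's response differs.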

The discrete-continuous dichot- 
omy, which was developed in the 
analysis of Thorndikian procedures, 
seems to have no Pavlovian parallel; 
Pavlovian training is an affair of 
discrete trials. Nor does the notion
of “chaining” have any application
to Pavlovian procedures. 

A Pavlovian situation, like a 
Thorndikian situation, may serve 
merely as an occasion for learning 
whose effects are measured in sub- 
sequent tests. One such case, well 
known to Pavlov, is that in which 
the presentation of CS and US is 
strictly simultaneous; only when the 
training procedure is altered can the 
effects of pairing be assessed. A
second is that of “sensory preconditioning”
(conceived originally by
Thorndike (1898) himself as a check
on the existence of “representations”),
which is analogous to the Thorndikian
experiment with consequences
of response which are lacking in 
motivational significance; neutral 
stimuli are paired, then one is given 
some behavioral property, and the ef- 
fects of the pairing are estimated from 
response to the other. A third case 
is that in which attention is centered 
on the acquisition, not of response- 
eliciting properties, but of rewarding 
properties (Williams, 1929); for ex- 
ample, an animal is fed repeatedly in 
a distinctive box (that is, box and 
food are paired), after which access 
to the empty box is made contingent 
upon response in a Thorndikian situa- 
tion. In one variety of experiment 
which has considerable theoretical 
importance, the order of these ex- 
periences is reversed; the contin- 
gency of access to the empty box 
upon some response is displayed, 
after which the animal is fed in the 
box, and the effect on response is 
measured (Gonzalez & Diamond, 
1960). In general, then, a Pavlovian 
situation may be defined without 
reference either to the occurrence in 
that situation of any particular kind 
of behavioral change, or to the func- 
tional properties of the stimuli which 
are paired. What is essential only is 
a sequence or conjunction of stimuli
whose contiguity is independent of the
animal’s response.


AVOIDANCE SITUATIONS 


The only learning situations which 
cannot be classified unequivocally as 
Pavlovian or Thorndikian are those 
which involve the avoidance of aver- 
sive stimulation. In them, Pavlovian 
and Thorndikian features are closely 
intertwined. On the one hand, a 
neutral stimulus is paired with an 
aversive stimulus, thereby acquiring 
certain arousing properties. The 
pairing is not, on the other hand, 
entirely independent of the animal’s 
behavior—the aversive stimulus is 
introduced only if the CS fails to 
elicit some defined response, whose 
likelihood of occurrence (low at the 
outset) the pairing serves to increase. 
This contingency of reinforcement 
on response is not displayed on the 
very first trial, as it is in a pure 
Thorndikian situation. In avoidance 
training, the contingency is a nega- 
tive one, which (since the mere pos- 
sibility of avoidance cannot influence 
the animal) does not become mani- 
fest until the Pavlovian procedure 
has taken effect. 

There is another Thorndikian con- 
tingency which operates in some 
(though not in all) avoidance situa- 
tions, this one making itself felt from 
the very first trial: termination of 
the aversive stimulus may be con- 
tingent on some defined response, 
often—but not always—the same re- 
sponse as that which avoids the 
aversive stimulus. In flexion condi- 
tioning, when shock to the limb is 
administered through a grid on which 
the limb of the animal rests, and 
when the scheduled duration of shock 
is substantial, flexion both escapes 
and avoids shock. In the shuttle box, 
too, the conditions of training may be 
such that changing compartments 
both escapes and avoids shock,
although, as Warner (1932) noted
early, the response which escapes 
shock may be different from that 
which avoids it (for example, leaping 
over a hurdle as compared with crawl- 
ing under). It is possible, of course, 
to set up an avoidance situation in 
which there is no escape at all. In 
flexion conditioning, shock may be 
administered through a bracelet at- 
tached to the limb, and a control cir- 
cuit so arranged that the CR will 
forestall the shock but the UR will 
not alter its scheduled duration. In 
the shuttle box, the shock may be 
very brief, terminating quite inde- 
pendently of any response the animal 
may make to it (Hunter, 1935). 
Even without escape, however, there 
remains the contingency of aversive 
stimulation on failure of response to 
the CS, an essential feature of avoid- 
ance training which distinguishes it 
from Pavlovian training, while the 
paired stimulation which is respon- 
sible for the emergence of response to 
the CS distinguishes it from Thorn- 
dikian training. Avoidance training 
seems to require a major category of 
its own. 
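
The contingencies just described can be made concrete with a small scoring sketch in Python. It is an illustration only: the names `run_trial` and `TrialOutcome`, the parameters, and the use of seconds as the unit are assumptions introduced here, not a description of any apparatus cited in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrialOutcome:
    avoided: bool          # response came before scheduled shock onset
    shock_duration: float  # seconds of shock actually received


def run_trial(
    response_latency: Optional[float],
    cs_us_interval: float,
    scheduled_shock: float,
    escape_possible: bool,
) -> TrialOutcome:
    """Score one discrete avoidance trial (hypothetical helper).

    response_latency: seconds from CS onset to the defined response,
        or None if the animal never responds on this trial.
    cs_us_interval: seconds from CS onset to scheduled shock onset.
    scheduled_shock: scheduled duration of the shock in seconds.
    escape_possible: True if a response during shock terminates it.
    """
    # Avoidance contingency: shock is delivered only if the CS fails
    # to elicit the response before the scheduled onset.
    if response_latency is not None and response_latency < cs_us_interval:
        return TrialOutcome(avoided=True, shock_duration=0.0)

    # Escape contingency (present in some situations only): a response
    # made during the shock cuts it short.
    if escape_possible and response_latency is not None:
        time_in_shock = response_latency - cs_us_interval
        return TrialOutcome(False, min(time_in_shock, scheduled_shock))

    # No escape: the shock runs its scheduled course regardless.
    return TrialOutcome(False, scheduled_shock)
```

With `escape_possible=False` the sketch corresponds to the bracelet or brief-shock arrangements, in which only the avoidance contingency remains; with `escape_possible=True` the same response both escapes and avoids.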

In its most common use, the shuttle
box may be classified as an Ad-1g
situation (A for avoidance); a single 
course of action is defined, and its 
latency is measured in discrete trials 
without systematic variation in sen- 
sory conditions. The corresponding 
discriminative (Ad-1d) situation also
may be generated in the shuttle box; 
for example, a bright light is followed 
by shock unless the defined response 
is made, but a dim light never is 
followed by shock. In such a situa- 
tion, it may be noted, discrimination 
can progress only as the animal fails 
to respond to the dim light, since the 
consequences of response to the two 
lights are identical. (In a T-1d situa- 
tion, by contrast, the consequences 
of response to the stimuli to be
discriminated are different, and
discrimination therefore is facilitated by
response to the negative stimulus; in 
a P-1d situation, discrimination may 
progress quite independently of re- 
sponse.) 

Choice among alternative courses 
of action also may be studied in 
avoidance situations. Suppose, for 
example, that shock from a grid in 
the floor of a T maze is scheduled x 
seconds after an animal is placed in 
the starting box. In the generalized 
(Ad-2g) case, shock is avoided by
prompt entrance into the end box on
the right, but not by entrance into
the end box on the left. In the
discriminative (Ad-2d) case, a turn to
the right avoids shock when the stem 
of the maze is black, while a turn to 
the left avoids shock when the stem 
is white. Two unconditioned stimuli 
are not required to generate an A-2 
situation as they are to generate a P-2 
situation, but two unconditioned 
stimuli may be used. For example, 
one signal is followed by avoidable 
shock to the right limb, while a sec- 
ond is followed by avoidable shock 
to the left limb (James, 1947). 

The discrete-continuous dichotomy 
developed in the analysis of
Thorndikian situations is applicable also to
avoidance training. An Ac-1g
situation may be constituted in a modified
Skinner box or a shuttle box. In a
design developed by Sidman (1953), 
no exteroceptive warning signal is 
used, but shock is scheduled every x 
seconds by a clock which the defined 
response resets. (The lack of an 
exteroceptive signal does not, of 
course, subvert the definition of 
avoidance training as originating in a
quasi-Pavlovian contiguity of
stimuli; as Pavlov himself showed,
internal processes correlated with the
passage of time since the occurrence 
of a specified event may be cast in 
the role of CS.) In the corresponding
discriminative (Ac-1d) case, the clock
which schedules shock runs only 
under one of two sensory conditions. 
Avoidance situations of the continu- 
ous type which do involve exterocep- 
tive signaling also are feasible. In the 
Ac-1g case, for example, shock from
a grid in the floor of a Skinner box 
is scheduled x seconds after the onset 
of a light and avoided by response on 
a variable-ratio schedule. Ac-2
situations, both generalized and
discriminative, may be generated when
alternative courses of action are de- 
fined. 
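
The unsignaled design attributed above to Sidman (1953) can be sketched as a short Python function. This is a hedged illustration of the schedule as the text states it (a single interval x, reset by each response); the function name, the treatment of shock as instantaneous, and the restarting of the clock after each shock are assumptions made here for the sake of a runnable example.

```python
from typing import List


def sidman_shock_times(
    response_times: List[float], x: float, session_length: float
) -> List[float]:
    """Return the times at which shock is delivered in a session.

    A clock schedules shock every `x` seconds; each defined response
    resets the clock, postponing the next shock to `x` seconds after
    that response. Shocks are assumed brief (zero duration here).
    """
    if x <= 0:
        raise ValueError("x must be positive")
    shocks: List[float] = []
    next_shock = x  # the clock starts at session onset
    pending = sorted(t for t in response_times if 0 <= t < session_length)
    i = 0
    while next_shock <= session_length:
        if i < len(pending) and pending[i] < next_shock:
            # Response arrives before the clock runs out: reset it.
            next_shock = pending[i] + x
            i += 1
        else:
            # Clock ran out with no response: deliver shock, restart.
            shocks.append(next_shock)
            next_shock += x
    return shocks
```

An animal that responds at least once in every x seconds receives no shock at all, while one that never responds is shocked at x, 2x, 3x, and so on. No exteroceptive signal enters the computation; only elapsed time since the last response or shock does.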

Like Thorndikian situations, avoid- 
ance situations may be chained. Just 
as an animal may learn to run a 
simple T maze under threat of shock, 
so it may learn to run a multiple T 
maze. An example of chaining in an 
avoidance situation of the continuous 
type is the following: with the onset 
of the CS, response to one manipu- 
landum is followed, on a variable- 
ratio schedule, by access to a second 
manipulandum, response to which, 
again on a variable-ratio schedule, 
terminates the CS and avoids shock. 

Although the term implies threat 
of an aversive condition which the 
animal learns to forestall, avoidance 
training, like Thorndikian and Pav- 
lovian training, may be characterized 
without reference to the nature of the 
stimuli employed or to the occurrence 
of behavioral change. It would be 
possible, for example, to train an 
animal with some neutral stimulus 
rather than shock in a shuttle box 
designed to produce a substantial 
frequency of spontaneous crossing, 
and then to test for learning after the 
neutral stimulus has been paired with 
shock. Irrespective of outcome, the 
conception of such an experiment is 
sufficient to delineate what is here 
regarded as the essential feature of 
avoidance training: a sequence of
stimuli is scheduled with the
occurrence of the second contingent upon
the failure of the animal to make some 
specified response to the first. 


TERMINOLOGY 


While there need be no detailed 
comparison of the classification here 
proposed with earlier ones, it may be 
worth while, in the interest of pre- 
serving whatever compatible usages 
may exist, to consider how well some 
of the broader methodological desig- 
nations which now are current will 
serve the needs of the new classifica- 
tion. Since current terminology 
derives from earlier classifications, 
the major differences in emphasis 
must become quite apparent in the 
process. 

The term “conditioning” usually
is used for the kind of training here 
called Pavlovian, but that term also 
is used rather widely to designate 
techniques which are not here classi- 
fied as Pavlovian, and often as a 
synonym for “learning” itself. The
term “classical conditioning” is closer
to what is here intended by Pavlovian, 
although in some contexts it has a 
narrower meaning (suggesting a har- 
nessed animal) and in other contexts 
a broader one (encompassing avoid- 
ance). Avoidance remains a useful 
term, but “instrumental
conditioning” is too ambiguous, since it has
been applied indiscriminately both to 
avoidance training and to pure 
Thorndikian training. The term 
“operant conditioning” is even more 
ambiguous; it has a narrow (Skin- 
nerian) sense in which it is tied to a 
questionable distinction between 
“elicited” and “emitted” behavior, as
well as a more general sense in which 
it is equivalent to instrumental
conditioning. The term “selective
learning” has a pure Thorndikian
connotation, but it seems to designate a
process of learning rather than a 
method of studying it. 


In general, there is little to salvage 
in the current terminology. Specific 
situational designations, such as maze, 
problem box, and runway, continue 
to be useful, but the broader
classificatory terms are unsuitable because
they are geared to methodological
dichotomy rather than to trichotomy. 
Even if dichotomy should in time 
give way to trichotomy, of course, it 
is likely that many of the older terms 
will continue to be used with altered 
meanings and with considerable con- 
sequent confusion. The terms for the 
subcategories here defined—unitary 
and choice situations, generalized and 
discriminative situations, discrete and 
continuous situations—fortunately do 
not compete with established usages 
and therefore create less opportunity 
for confusion, although it is possible 
that a clearer notation might be
found. Reflection will show,
however, that complexity of notation is to
a certain extent an inevitable con- 
sequence of the amount of informa- 
tion to be conveyed. 
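
How much information a single designation carries can be seen by decoding one. The Python sketch below is illustrative only: the label grammar, the inline spelling of the discrete/continuous marker as a letter after the category (for example "Ad-1g"), and the names `parse_label` and `Situation` are assumptions of the sketch rather than part of the classification itself.

```python
import re
from dataclasses import dataclass
from typing import Optional

CATEGORY = {"P": "Pavlovian", "T": "Thorndikian", "A": "avoidance"}
TRIAL = {"d": "discrete", "c": "continuous"}
COURSES = {"1": "unitary", "2": "choice"}
SENSORY = {"g": "generalized", "d": "discriminative"}

# Label grammar assumed for this sketch: category letter, optional
# discrete/continuous marker, hyphen, number of courses of action,
# optional generalized/discriminative letter ("Ad-1g", "T-1d", "A-2").
LABEL = re.compile(r"^([PTA])([dc])?-([12])([gd])?$")


@dataclass(frozen=True)
class Situation:
    category: str
    trial_structure: Optional[str]  # None when the label leaves it open
    courses: str
    sensory: Optional[str]          # None when the label leaves it open


def parse_label(label: str) -> Situation:
    """Decode a designation such as "Ad-1g" into its four components."""
    m = LABEL.match(label)
    if m is None:
        raise ValueError(f"unrecognized label: {label!r}")
    cat, trial, n, sens = m.groups()
    return Situation(
        category=CATEGORY[cat],
        trial_structure=TRIAL[trial] if trial else None,
        courses=COURSES[n],
        sensory=SENSORY[sens] if sens else None,
    )
```

Each position in the label answers one of the questions the classification asks, so a shorter notation would have to leave at least one of them unanswered.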

It is natural that a new classifica- 
tion should require a new terminol- 
ogy, although a change in classifica- 
tion does not, of course, necessarily 
imply an advance in conception. 
Whether the classification here pro- 
posed represents an advance in think- 
ing about the interrelations among 
learning situations cannot now be 
told. Classification is more, ulti- 
mately, than a matter of taste, but 
there is little else on which to depend 
at the present time. It is to be hoped 
that a renewed concern with problems
of classification will stimulate fur- 
ther research on methodological inter- 
relations. 


REFERENCES 


Bitterman, M. E. Toward a comparative
psychology of learning. Amer. Psycholo- 
gist, 1960, 15, 704-712. 

Brunswik, E. Probability as a determiner 
of rat behavior. J. exp. Psychol., 1939, 25, 
175-197. 

Estes, W. K., & Skinner, B. F. Some quantitative
properties of anxiety. J. exp. Psychol.,
1941, 29, 390-400.

Gonzalez, R. C., & Diamond, L. A test of
Spence’s theory of incentive motivation. 
Amer. J. Psychol., 1960, 73, 396-403. 

Grice, G. R. Visual discrimination learning 
with simultaneous and successive presenta- 
tion of stimuli. J. comp. physiol. Psychol., 
1949, 42, 365-373.

Herbert, M. J., & Arnold, W. J. A reaction
chaining apparatus. J. comp. physiol.
Psychol., 1947, 40, 227-229.

Hicks, V. C. The relative values of different 
curves of learning. J. anim. Behav., 1911, 
1, 138-156. 

Hilgard, E. R., & Marquis, D. G.
Conditioning and learning. New York: Appleton-
Century, 1940. 

Hobhouse, L. T. Mind in evolution. London:
Macmillan, 1901. 

Hunter, W. S. Conditioning and extinction 
in the rat. Brit. J. Psychol., 1935, 26, 135- 
148. 


James, W. T. The use of work in developing 
a differential conditioned reaction of
antagonistic reflex systems. J. comp. physiol.
Psychol., 1947, 40, 177-182. 

Lashley, K. S. A simple maze: With data
on the relation of the distribution of
practice to the rate of learning. Psychobiology,
1918, 1, 353-367.

Longo, N., & Bitterman, M. E. The effect
of partial reinforcement with spaced
practice on resistance to extinction in the fish.
J. comp. physiol. Psychol., 1960, 53, 169-172.

Pavlov, I. P. Conditioned reflexes: An
investigation of the physiological activity of the
cerebral cortex. London: Oxford Univer.
Press, 1927.

Sidman, M. Avoidance conditioning with
brief shock and no exteroceptive warning 
signal. Science, 1953, 118, 157-158. 

Skinner, B. F. On the rate of formation of a
conditioned reflex. J. gen. Psychol., 1932,
7, 274-285.

Skinner, B. F. Two types of conditioned
reflex and a pseudo type. J. gen. Psychol.,
1935, 12, 66-76. 

Skinner, B. F. Two types of conditioned
reflex: A reply to Konorski and Miller. J.
gen. Psychol., 1937, 16, 272-282. 

Solomon, R. L. Latency of response as a
measure of learning in a “single-door”
discrimination. Amer. J. Psychol., 1943, 56,
422-432.

Spence, K. W. Behavior theory and condition- 
ing. New Haven: Yale Univer. Press, 1956. 

Stone, C. P. A multiple discrimination box 
and its use in studying the learning ability 
of rats: I. Reliability of scores. J. genet. 
Psychol., 1928, 35, 557-573. 

Thorndike, E. L. Animal intelligence: An
experimental study of the associative
processes in animals. Psychol. Rev. monogr.
Suppl., 1898, 2(4, Whole No. 8).

Thorndike, E. L. The mental life of
monkeys. Psychol. Rev. monogr. Suppl., 1901,
3(5, Whole No. 15). 

Warner, L. H. The association span of the
white rat. J. genet. Psychol., 1932, 41, 57-89.

Williams, K. A. The reward value of a
conditioned stimulus. U. Calif. Publ. Psychol.,
1929, 4, 31-55. 


Wodinsky, J., & Bitterman, M. E. Partial
reinforcement in the fish. Amer. J. Psychol.,
1959, 72, 184-199. 

Wodinsky, J., & Bitterman, M. E.
Resistance to extinction in the fish after
extensive training with partial reinforcement.
Amer. J. Psychol., 1960, 73, 429-434.

Woodworth, R. S. Experimental psychology.
New York: Henry Holt, 1938. 

Youtz, R. E. P. The change with time of a 
Thorndikian response in the rat. J. exp. 
Psychol., 1938, 23, 128-140. (a) 

Youtz, R. E. P. Reinforcement, extinction, 
and spontaneous recovery in a non-Pavlo- 
vian reaction. J. exp. Psychol., 1938, 22, 
305-318. (b) 

Youtz, R. E. P. The weakening of one Thorn- 
dikian response following the extinction of 
another. J. exp. Psychol., 1939, 24, 294-304.


(Received January 9, 1961) 


